Similarity based deduplication with small data chunks
Similar resources
Similarity Based Deduplication with Small Data Chunks
Large backup and restore systems may have a petabyte or more data in their repository. Such systems are often compressed by means of deduplication techniques, that partition the input text into chunks and store recurring chunks only once. One of the approaches is to use hashing methods to store fingerprints for each data chunk, detecting identical chunks with very low probability for collisions...
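The hashing approach mentioned in this abstract can be illustrated with a minimal sketch. It is not the method of the paper itself: the fixed chunk size, the choice of SHA-256 as the fingerprint, and the names `deduplicate` and `restore` are all assumptions made for illustration.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 8):
    """Split data into fixed-size chunks and store each distinct chunk once.

    Hypothetical helper: chunk_size and SHA-256 are illustrative choices.
    """
    store = {}   # fingerprint -> chunk bytes (each distinct chunk kept once)
    recipe = []  # ordered fingerprints needed to rebuild the original input
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fingerprint, chunk)  # keep only the first copy
        recipe.append(fingerprint)
    return store, recipe


def restore(store, recipe):
    """Rebuild the original bytes by following the fingerprint sequence."""
    return b"".join(store[fp] for fp in recipe)


# Five chunks of 8 bytes, of which only two are distinct.
data = b"ABCDEFGH" * 3 + b"12345678" + b"ABCDEFGH"
store, recipe = deduplicate(data)
```

Here the repository keeps two chunks while the recipe lists five fingerprints, so the repeated chunk costs only a pointer on each recurrence. Real systems typically use much larger, often content-defined, chunks.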
Optimal Partitioning of Data Chunks in Deduplication Systems
Deduplication is a special case of data compression in which repeated chunks of data are stored only once. For very large chunks, this process may be applied even if the chunks are similar and not necessarily identical, and then the encoding of duplicate data consists of a sequence of pointers to matching parts. However, not all the pointers are worth being kept, as they incur some storage over...
Similarity-based Deduplication for Databases
dDedup is a similarity-based deduplication scheme for on-line database management systems (DBMSs). Beyond block-level compression of individual database pages or operation log (oplog) messages, as used in today’s DBMSs, dDedup uses byte-level delta encoding of individual records within the database to achieve greater savings. dDedup’s single-pass encoding method can be integrated into the stora...
Cloud Based Data Deduplication with Secure Reliability
To eliminate duplicate copies of data, we use the data de-duplication process. It is also used in cloud storage to minimize memory space and upload bandwidth: only one copy of every file is stored in the cloud, and that copy can be used by many users. The deduplication process helps to improve storage space. Another challenge, privacy for sensitive data, also arises. The aim of this pap...
From Chunks to Function-Argument Structure: A Similarity-Based Approach
Chunk parsing has focused on the recognition of partial constituent structures at the level of individual chunks. Little attention has been paid to the question of how such partial analyses can be combined into larger structures for complete utterances. Such larger structures are not only desirable for a deeper syntactic analysis. They also constitute a necessary prerequisite for assigning func...
Journal
Journal title: Discrete Applied Mathematics
Year: 2016
ISSN: 0166-218X
DOI: 10.1016/j.dam.2015.09.018